Biostatistics For Dummies (Monika Wahi John Pezzullo)

The p value, which is based on the Student t distribution, is the probability that random fluctuations

alone could produce a t value at least as large (in magnitude) as the value you just calculated.

The Student t statistic is always calculated using the general equation D/SE. Each specific type of t test

we discussed earlier — including one-group, paired, unpaired, and Welch — calculates D, SE, and df

slightly differently. These different calculations are summarized in Table 11-1.

TABLE 11-1 How t Tests Calculate Difference, Standard Error, and Degrees of Freedom

| | One-Group | Paired | Unpaired t Equal Variance | Welch t Unequal Variance |
| --- | --- | --- | --- | --- |
| D | Difference between mean of observations and a hypothesized value (h) | Mean of paired differences | Difference between means of the two groups | Difference between means of the two groups |
| SE | SE of the observations | SE of paired differences | SE of difference, based on a pooled estimate of SD within each group | SE of difference, from SE of each mean, by propagation of errors |
| df | Number of observations – 1 | Number of pairs – 1 | Total number of observations – 2 | “Effective” df, based on the size and SD of the two groups |
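To make the rows of Table 11-1 more concrete, here is a minimal R sketch that builds D, SE, and df by hand for the one-group, unpaired equal-variance, and Welch tests, and then compares the results with R's built-in t.test() function. The vectors x and y, the value h, and the helper function t_from_parts() are made up for this illustration; they are not NHANES data or part of any package. The "effective" df shown for the Welch test uses the Welch-Satterthwaite formula.

```r
# Made-up example data (not from NHANES)
x <- c(98, 105, 110, 93, 101, 99, 120, 96)
y <- c(88, 102, 95, 91, 99, 104, 90, 97, 93, 100)
h <- 100  # hypothesized value for the one-group test

# Hypothetical helper: t = D/SE, with a two-sided p value
# taken from the Student t distribution with df degrees of freedom
t_from_parts <- function(D, SE, df) {
  t <- D / SE
  p <- 2 * pt(-abs(t), df)
  list(t = t, df = df, p = p)
}

n1 <- length(x)
n2 <- length(y)

# One-group: D = mean - h, SE = SD / sqrt(n), df = n - 1
one <- t_from_parts(mean(x) - h, sd(x) / sqrt(n1), n1 - 1)

# Unpaired, equal variance: SE from a pooled SD, df = total n - 2
sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
pooled <- t_from_parts(mean(x) - mean(y),
                       sd_pooled * sqrt(1 / n1 + 1 / n2),
                       n1 + n2 - 2)

# Welch: SE by propagation of errors, "effective" df (Welch-Satterthwaite)
v1 <- var(x) / n1
v2 <- var(y) / n2
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
welch <- t_from_parts(mean(x) - mean(y), sqrt(v1 + v2), df_welch)

# Compare the hand calculations with t.test()
r_one    <- t.test(x, mu = h)
r_pooled <- t.test(x, y, var.equal = TRUE)
r_welch  <- t.test(x, y)  # Welch is the default for two samples in R

all.equal(one$t,    unname(r_one$statistic))
all.equal(pooled$t, unname(r_pooled$statistic))
all.equal(welch$df, unname(r_welch$parameter))
all.equal(welch$p,  r_welch$p.value)
```

Each all.equal() call should return TRUE if the hand calculation agrees with t.test(). The paired test isn't shown separately because it follows the one-group recipe applied to the paired differences.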

Executing a t test

Statistical software packages contain commands that can execute (or run) t tests (see Chapter 4

for more about these packages). The examples presented here use R, and in this section, we

explain the data structure required for running the various t tests in R. For demonstration, we use

data from the National Health and Nutrition Examination Survey (NHANES) 2017–2020 file

(available at wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?Cycle=2017-2020).

For the one-group t test, you need the column of data containing the variable whose mean you

want to compare to the hypothesized value (H), and you need to know H. R and other software

enable you to specify a value for H and assume 0 if you don’t specify anything. In the NHANES

data, the fasting glucose variable is LBXGLU, so the R code to test the mean fasting glucose

against a maximum healthy level of 100 mg/dL in an R dataframe named GLUCOSE is

t.test(GLUCOSE$LBXGLU, mu = 100).
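If you don't have the NHANES file loaded, you can still try the one-group command on stand-in data. The following sketch assumes a simulated dataframe named GLUCOSE with a single LBXGLU column; the values, the sample size of 200, and the seed are invented for illustration and don't reflect real NHANES results.

```r
set.seed(11)  # arbitrary seed so the simulated values are reproducible

# Simulated stand-in for the NHANES glucose data (values are not real)
GLUCOSE <- data.frame(LBXGLU = rnorm(200, mean = 108, sd = 25))

# One-group t test of mean fasting glucose against H = 100 mg/dL
result <- t.test(GLUCOSE$LBXGLU, mu = 100)
print(result)

# Individual pieces of the result object
result$statistic   # the t value (D/SE)
result$parameter   # df = number of observations - 1
result$p.value     # two-sided p value
result$conf.int    # 95 percent confidence interval for the mean
```

With real survey data, some participants are likely to have missing glucose values; t.test() drops missing values from the vector before calculating, so df reflects the number of non-missing observations minus 1.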

For the paired t test, you need two columns of data representing the pair of numbers you want to

enter into the paired t test. For example, in NHANES, systolic blood pressure (SBP) was

measured in the same participant twice (variables BPXOSY1 and BPXOSY2). To compare these

with a paired t test in an R dataframe named BP, the code is t.test(BP$BPXOSY1, BP$BPXOSY2, paired = TRUE).
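The paired command can be tried on stand-in data as well. This sketch assumes a simulated dataframe named BP with columns BPXOSY1 and BPXOSY2; the readings, the sample size of 150, and the seed are invented and are not real NHANES measurements. It also checks that the paired t test gives the same answer as a one-group t test on the paired differences against 0, which is how Table 11-1 describes it.

```r
set.seed(12)  # arbitrary seed so the simulated values are reproducible

# Simulated stand-in for two SBP readings on the same 150 participants
first_reading <- rnorm(150, mean = 122, sd = 15)
BP <- data.frame(
  BPXOSY1 = first_reading,
  BPXOSY2 = first_reading + rnorm(150, mean = -1, sd = 5)
)

# Paired t test on the two columns
paired <- t.test(BP$BPXOSY1, BP$BPXOSY2, paired = TRUE)
print(paired)

# Equivalent one-group t test on the paired differences against 0
diffs <- BP$BPXOSY1 - BP$BPXOSY2
one_group <- t.test(diffs, mu = 0)

all.equal(unname(paired$statistic), unname(one_group$statistic))
all.equal(paired$p.value, one_group$p.value)
paired$parameter  # df = number of pairs - 1
```

Both all.equal() calls should return TRUE. Note that the two columns must be in the same row order, so each row holds both readings for one participant.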

For the independent t test, you need to have one column coded as the grouping variable

(preferably with a two-state flag coded as 0 and 1), and another column with the value you want to

test. We created a two-state flag in the NHANES data called MARRIED where 1 = married and 0

= all other marital statuses. To compare mean fasting glucose level between these two groups in

a dataframe named NHANES, we used this code: t.test(NHANES$LBXGLU ~